NSF PAR Search | NSF Public Access Repository


Personalized mood prediction from patterns of behavior collected with smartphones

https://doi.org/10.1038/s41746-024-01035-6

Balliu, Brunilda; Douglas, Chris; Seok, Darsol; Shenhav, Liat; Wu, Yue; Chatzopoulou, Doxa; Kaiser, William; Chen, Victor; Kim, Jennifer; Deverasetty, Sandeep; et al. (December 2024, npj Digital Medicine)

Abstract: Over the last ten years, there has been considerable progress in using digital behavioral phenotypes, captured passively and continuously from smartphones and wearable devices, to infer depressive mood. However, most digital phenotype studies suffer from poor replicability, often fail to detect clinically relevant events, and use measures of depression that are not validated or suitable for collecting large and longitudinal data. Here, we report high-quality longitudinal validated assessments of depressive mood from computerized adaptive testing paired with continuous digital assessments of behavior from smartphone sensors for up to 40 weeks on 183 individuals experiencing mild to severe symptoms of depression. We apply a combination of cubic spline interpolation and idiographic models to generate individualized predictions of future mood from the digital behavioral phenotypes, achieving high prediction accuracy of depression severity up to three weeks in advance (R² ≥ 80%) and a 65.7% reduction in the prediction error over a baseline model which predicts future mood based on past depression severity alone. Finally, our study verified the feasibility of obtaining high-quality longitudinal assessments of mood from a clinical population and predicting symptom severity weeks in advance using passively collected digital behavioral data. Our results indicate the possibility of expanding the repertoire of patient-specific behavioral measures to enable future psychiatric research.
Full Text Available
Methylation risk scores are associated with a collection of phenotypes within electronic health record systems

https://doi.org/10.1038/s41525-022-00320-1

Thompson, Mike; Hill, Brian L.; Rakocz, Nadav; Chiang, Jeffrey N.; Geschwind, Daniel; Sankararaman, Sriram; Hofer, Ira; Cannesson, Maxime; Zaitlen, Noah; Halperin, Eran (December 2022, npj Genomic Medicine)

Abstract: Inference of clinical phenotypes is a fundamental task in precision medicine, and has therefore been heavily investigated in recent years in the context of electronic health records (EHR) using a large arsenal of machine learning techniques, as well as in the context of genetics using polygenic risk scores (PRS). In this work, we considered the epigenetic analog of PRS, methylation risk scores (MRS), a linear combination of methylation states. We measured methylation across a large cohort (n = 831) of diverse samples in the UCLA Health biobank, for which both genetic and complete EHR data are available. We constructed MRS for 607 phenotypes spanning diagnoses, clinical lab tests, and medication prescriptions. When added to a baseline set of predictive features, MRS significantly improved the imputation of 139 outcomes, whereas the PRS improved only 22 (median improvement for methylation 10.74%, 141.52%, and 15.46% in medications, labs, and diagnosis codes, respectively, whereas genotypes only improved the labs at a median increase of 18.42%). We added significant MRS to state-of-the-art EHR imputation methods that leverage the entire set of medical records, and found that including MRS as a medical feature in the algorithm significantly improves EHR imputation in 37% of lab tests examined (median R² increase 47.6%). Finally, we replicated several MRS in multiple external studies of methylation (minimum p-value of 2.72 × 10⁻⁷) and replicated 22 of 30 tested MRS internally in two separate cohorts of different ethnicity. Our publicly available results and weights show promise for methylation risk scores as clinical and scientific tools.
Full Text Available
Evaluating supervised and unsupervised background noise correction in human gut microbiome data

https://doi.org/10.1371/journal.pcbi.1009838

Briscoe, Leah; Balliu, Brunilda; Sankararaman, Sriram; Halperin, Eran; Garud, Nandita R. (February 2022, PLOS Computational Biology)
Segata, Nicola (Ed.)
Abstract: The ability to predict human phenotypes and identify biomarkers of disease from metagenomic data is crucial for the development of therapeutics for microbiome-associated diseases. However, metagenomic data is commonly affected by technical variables unrelated to the phenotype of interest, such as sequencing protocol, which can make it difficult to predict phenotype and find biomarkers of disease. Supervised methods to correct for background noise, originally designed for gene expression and RNA-seq data, are commonly applied to microbiome data but may be limited because they cannot account for unmeasured sources of variation. Unsupervised approaches address this issue, but current methods are limited because they are ill-equipped to deal with the unique aspects of microbiome data, which is compositional, highly skewed, and sparse. We perform a comparative analysis of different denoising transformations in combination with supervised correction methods, as well as an unsupervised principal component correction approach that is presently used in other domains but has not been applied to microbiome data to date. We find that the unsupervised principal component correction approach is comparable to the supervised approaches in reducing false discovery of biomarkers, with the added benefit of not needing to know the sources of variation a priori. However, in prediction tasks, it appears to improve prediction only when technical variables account for the majority of the variance in the data. As new and larger metagenomic datasets become increasingly available, background noise correction will become essential for generating reproducible microbiome analyses.
Full Text Available
Leveraging genomic diversity for discovery in an electronic health record linked biobank: the UCLA ATLAS Community Health Initiative

https://doi.org/10.1186/s13073-022-01106-x

Johnson, Ruth; Ding, Yi; Venkateswaran, Vidhya; Bhattacharya, Arjun; Boulier, Kristin; Chiu, Alec; Knyazev, Sergey; Schwarz, Tommer; Freund, Malika; Zhan, Lingyu; et al. (December 2022, Genome Medicine)

Abstract
Background: Large medical centers in urban areas, like Los Angeles, care for a diverse patient population and offer the potential to study the interplay between genetic ancestry and social determinants of health. Here, we explore the implications of genetic ancestry within the University of California, Los Angeles (UCLA) ATLAS Community Health Initiative, an ancestrally diverse biobank of genomic data linked with de-identified electronic health records (EHRs) of UCLA Health patients (N = 36,736).
Methods: We quantify the extensive continental and subcontinental genetic diversity within the ATLAS data through principal component analysis, identity-by-descent, and genetic admixture. We assess the relationship between genetically inferred ancestry (GIA) and >1500 EHR-derived phenotypes (phecodes). Finally, we demonstrate the utility of genetic data linked with EHR to perform ancestry-specific and multi-ancestry genome and phenome-wide scans across a broad set of disease phenotypes.
Results: We identify 5 continental-scale GIA clusters including European American (EA), African American (AA), Hispanic Latino American (HL), South Asian American (SAA) and East Asian American (EAA) individuals and 7 subcontinental GIA clusters within the EAA GIA corresponding to Chinese American, Vietnamese American, and Japanese American individuals. Although we broadly find that self-identified race/ethnicity (SIRE) is highly correlated with GIA, we still observe marked differences between the two, emphasizing that the populations defined by these two criteria are not analogous. We find a total of 259 significant associations between continental GIA and phecodes even after accounting for individuals’ SIRE, demonstrating that for some phenotypes, GIA provides information not already captured by SIRE. GWAS identifies significant associations for liver disease in the 22q13.31 locus across the HL and EAA GIA groups (HL p-value = 2.32 × 10⁻¹⁶, EAA p-value = 6.73 × 10⁻¹¹). A subsequent PheWAS at the top SNP reveals significant associations with neurologic and neoplastic phenotypes specifically within the HL GIA group.
Conclusions: Overall, our results explore the interplay between SIRE and GIA within a disease context and underscore the utility of studying the genomes of diverse individuals through biobank-scale genotyping linked with EHR-based phenotyping.
Full Text Available
BATMAN: Fast and Accurate Integration of Single-Cell RNA-Seq Datasets via Minimum-Weight Matching

https://doi.org/10.1016/j.isci.2020.101185

Mandric, Igor; Hill, Brian L.; Freund, Malika K.; Thompson, Michael; Halperin, Eran (June 2020, iScience)
Full Text Available
Compositional Lotka-Volterra describes microbial dynamics in the simplex

https://doi.org/10.1371/journal.pcbi.1007917

Joseph, Tyler A.; Shenhav, Liat; Xavier, Joao B.; Halperin, Eran; Pe’er, Itsik (May 2020, PLOS Computational Biology)
Dakos, Vasilis (Ed.)
Full Text Available
Automated identification of clinical features from sparsely annotated 3-dimensional medical imaging

https://doi.org/10.1038/s41746-021-00411-w

Rakocz, Nadav; Chiang, Jeffrey N.; Nittala, Muneeswar G.; Corradetti, Giulia; Tiosano, Liran; Velaga, Swetha; Thompson, Michael; Hill, Brian L.; Sankararaman, Sriram; Haines, Jonathan L.; et al. (December 2021, npj Digital Medicine)
Abstract: One of the core challenges in applying machine learning and artificial intelligence to medicine is the limited availability of annotated medical data. Unlike in other applications of machine learning, where an abundance of labeled data is available, the labeling and annotation of medical data and images require a major effort of manual work by expert clinicians who do not have the time to annotate manually. In this work, we propose a new deep learning technique (SLIVER-net), to predict clinical features from 3-dimensional volumes using a limited number of manually annotated examples. SLIVER-net is based on transfer learning, where we borrow information about the structure and parameters of the network from publicly available large datasets. Since public volume data are scarce, we use 2D images and account for the 3-dimensional structure using a novel deep learning method which tiles the volume scans, and then adds layers that leverage the 3D structure. In order to illustrate its utility, we apply SLIVER-net to predict risk factors for progression of age-related macular degeneration (AMD), a leading cause of blindness, from optical coherence tomography (OCT) volumes acquired from multiple sites. SLIVER-net successfully predicts these factors despite being trained with a relatively small number of annotated volumes (hundreds) and only dozens of positive training examples. Our empirical evaluation demonstrates that SLIVER-net significantly outperforms standard state-of-the-art deep learning techniques used for medical volumes, and its performance is generalizable as it was validated on an external testing set. In a direct comparison with a clinician panel, we find that SLIVER-net also outperforms junior specialists, and identifies AMD progression risk factors similarly to expert retina specialists.
Full Text Available
Scalable probabilistic PCA for large-scale genetic variation data

https://doi.org/10.1371/journal.pgen.1008773

Agrawal, Aman; Chiu, Alec M.; Le, Minh; Halperin, Eran; Sankararaman, Sriram; Gravel, Simon (May 2020, PLOS Genetics)

Full Text Available
CONFINED: distinguishing biological from technical sources of variation by leveraging multiple methylation datasets

https://doi.org/10.1186/s13059-019-1743-y

Thompson, Mike; Chen, Zeyuan Johnson; Rahmani, Elior; Halperin, Eran (December 2019, Genome Biology)

Full Text Available
Stochasticity constrained by deterministic effects of diet and age drive rumen microbiome assembly dynamics

https://doi.org/10.1038/s41467-020-15652-8

Furman, Ori; Shenhav, Liat; Sasson, Goor; Kokou, Fotini; Honig, Hen; Jacoby, Shamay; Hertz, Tomer; Cordero, Otto X.; Halperin, Eran; Mizrahi, Itzhak (December 2020, Nature Communications)

Full Text Available
